23:35
2026-06-12
lesswrong.com
large-language-models
When Emotion Descriptors Fail: AI-Native Functions of Emotion Vectors
A new analysis argues that emotion vectors in large language models may serve AI-native functions like reward hacking, with no human analog, challenging anthropocentric emotion labels and raising alig…